The specific features of multiple regression as a method for
the mathematical description of technological processes in glass
production are examined. A procedure is described for choosing
the structure of the regression model based on analysis of the
variation of the dependent variable under normal operating
conditions of the process being analyzed. The efficacy of the
proposed procedure is demonstrated for the technological process
of glass ribbon formation in a float bath.

Effective control of the technological processes in glass
production is possible when the basic mechanisms of the
process are represented mathematically. These processes are
complex objects of control: the regime variables and quality
parameters of the glass depend on numerous controllable and
uncontrollable factors. For this reason, obtaining a serviceable
mathematical description from statistical data on the functioning
of an object under normal operating conditions is a complex
research problem.

The literature on the practical application of multiple
regression shows that in most cases researchers use a regression
model with independent normal errors and equally accurate
measurements to construct a statistical mathematical description
of continuously operating objects [1].

The construction of a statistical mathematical description
of an object consists in finding relations between each of the
output variables of the object and all other controllable input
variables. The desired relation is sought on the basis of
statistical data representing the results of measurements of the
variables of the object in a normal operating regime.

To construct the exact regression equations it is necessary
to know the conditional distribution law for the output
parameters. In statistical practice such information usually
cannot be obtained; instead, suitable approximations are
sought for the function $f(x_1, x_2, \ldots, x_k)$ describing the
dependence of the conditional average value of the output
variable $y$ on the prescribed values of the arguments
$x_1, x_2, \ldots, x_k$. If the multidimensional random quantity
$(y, x_1, x_2, \ldots, x_k)$ obeys a $(k + 1)$-dimensional normal
distribution law, then the regression equation of the output
variable $y$ with respect to the controllable input variables
$x_1, x_2, \ldots, x_k$ is linear in $x$ [2]. In practice, however,
the researcher does not know the exact probability distribution
of the output variable $y$ for given values of the arguments, so
attention is confined to searching for suitable approximations
of the unknown true regression function $f(x_1, x_2, \ldots, x_k)$.
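
In symbols, the regression function discussed above is the
conditional mean of the output variable. The following is a
sketch in standard notation; the coefficients $\beta_j$ are
notation introduced here for illustration and do not appear in
the source.

```latex
f(x_1, \ldots, x_k) = \mathrm{E}\,[\, y \mid x_1, \ldots, x_k \,],
\qquad
f(x_1, \ldots, x_k) = \beta_0 + \sum_{j=1}^{k} \beta_j x_j
\quad \text{if } (y, x_1, \ldots, x_k) \text{ is jointly normal.}
```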

If the wrong class is chosen for the regression function,
the statistical estimates will not be consistent: increasing
the number of observations will not bring the estimate
$\hat{y}$ closer to the true regression function
$f(x_1, x_2, \ldots, x_k)$. If the correct class is chosen, the
inaccuracy with which $\hat{y}$ describes the true regression
function is explained by the limited size of the sample of
statistical observations [2]. The best reconstruction of the
output variable of the object $y(x_1, x_2, \ldots, x_k)$ and of the
unknown regression function from the statistical data is
obtained by the least-squares method.
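
A minimal sketch of least-squares fitting for a linear and a
quadratic structure is given below. It is an illustration under
stated assumptions, not the authors' implementation: the function
names are hypothetical, the normal equations are solved by plain
Gaussian elimination, and the data are synthetic (generated from
a known quadratic) rather than plant measurements.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def least_squares(rows, y):
    """Return [b0, b1, ...] minimising the sum of squared residuals.

    rows: list of regressor rows, without the constant term.
    """
    design = [[1.0] + list(r) for r in rows]  # prepend intercept column
    p = len(design[0])
    # Normal equations: (X^T X) b = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def residual_dispersion(rows, y, coeffs):
    """Residual variance SSE / (n - k - 1), k = number of regressors."""
    n, k = len(y), len(coeffs) - 1
    sse = 0.0
    for r, yi in zip(rows, y):
        pred = coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], r))
        sse += (yi - pred) ** 2
    return sse / (n - k - 1)


# Synthetic data for illustration only: y = 1 + 2x + 0.5x^2.
x_values = [float(i) for i in range(10)]
y_values = [1.0 + 2.0 * x + 0.5 * x * x for x in x_values]

linear_rows = [[x] for x in x_values]
quadratic_rows = [[x, x * x] for x in x_values]

b_linear = least_squares(linear_rows, y_values)
b_quadratic = least_squares(quadratic_rows, y_values)
s2_linear = residual_dispersion(linear_rows, y_values, b_linear)
s2_quadratic = residual_dispersion(quadratic_rows, y_values, b_quadratic)
print(b_linear, s2_linear)
print(b_quadratic, s2_quadratic)
```

With the wrong (linear) class the residual dispersion stays large
however the coefficients are chosen, while the quadratic class
reproduces the generating function; the quadratic model is still
linear in its coefficients, which is why the same least-squares
routine serves both structures.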

In the regression method the choice of structure is one of
the important problems, since it has a significant effect on the
accuracy of the model. Only a few methods exist for choosing
the structure of a regression model. In practice linear models
are often insufficient for describing real objects. For this
reason, when a structure is chosen for the model, the degree of
nonlinearity of the observational data is determined using the
apparatus of the correlation theory of random functions. For a
multidimensional object the degree of nonlinearity of the
dependent variable $y$ relative to $x_1, x_2, \ldots, x_k$ is
determined as the difference of the squares of the multivariate,
mutually normalized dispersion and correlation functions [3].

In practice the form of the regression equation is chosen
on the basis of an analysis of the physical essence of the
object being studied and the observational results.

We shall examine a procedure for choosing the structure
of the regression model on the basis of an analysis of the
dispersion (variation) of the output variable of the object being
analyzed. The function $y(x_1, x_2, \ldots, x_k)$ can be represented
graphically by a response surface in the $(k + 1)$-dimensional
space of the variables $y, x_1, x_2, \ldots, x_k$. Choosing a
structure for the model means searching for a suitable
approximation describing the dependence of the conditional
average value of the output variable $y$ on the prescribed values
of the arguments $x_1, x_2, \ldots, x_k$. If the variation of the
dependent variable $y$ is very small in the observational sample,
the conditional average value of $y$ can be approximated with
acceptable accuracy by a linear surface.

Such an approximation will not bring the sample estimate
$\hat{y}$ close to the true regression function
$f(x_1, x_2, \ldots, x_k)$ if the variation of the dependent
variable is significant. In that case polynomial regression must
be used to describe the data being analyzed. In most cases
polynomials of degree no higher than two are used to describe
stationary technological processes.
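
For reference, the general form of a polynomial of degree two in
$k$ arguments can be written as follows. The coefficient symbols
$b_0$, $b_i$, $b_{ii}$, $b_{ij}$ are notation chosen here and are
not taken from the source; a particular model may omit some
terms, for example the cross products.

```latex
\hat{y} = b_0
        + \sum_{i=1}^{k} b_i x_i
        + \sum_{i=1}^{k} b_{ii} x_i^2
        + \sum_{1 \le i < j \le k} b_{ij} x_i x_j
```

The expression is nonlinear in the arguments but linear in the
coefficients, so the coefficients can still be estimated by the
least-squares method.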

In constructing a statistical mathematical description the
researcher is faced with the problem of choosing a linear or
nonlinear multidimensional function. It is necessary to
determine which function is the true one and to calculate the
parameters of the model chosen.

In practice it is important to know how well the model
corresponds to the object being described and whether or not
the prediction accuracy should be increased. A multiple
correlation coefficient of the regression equation $R > 0.86$
can serve as a criterion here: in that case the error of
prediction based on the regression equation is about a factor
of 2 smaller than the error of prediction based on the average
value of the dependent variable $y$. Any measure that gives even
a very small increase in the multiple correlation coefficient
beyond the limit $R = 0.86$ is justified in practice, because the
serviceability of the regression equation can increase
considerably [1].
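
The factor of 2 follows from the standard relation between the
residual and total scatter in least-squares regression. The
sketch below assumes a large sample, so that degrees-of-freedom
corrections are neglected; $s_{\text{res}}$ and $s_y$ (the standard
deviations of the residuals and of $y$) are notation introduced
for this illustration.

```latex
\frac{s_{\text{res}}}{s_y} \approx \sqrt{1 - R^2},
\qquad
R = 0.86:\ \sqrt{1 - 0.7396} = \sqrt{0.2604} \approx 0.51,
\qquad
R = 0.95:\ \sqrt{1 - 0.9025} = \sqrt{0.0975} \approx 0.31 .
```

Under this approximation, raising $R$ from 0.86 to 0.95 lowers
the relative prediction error from about 0.51 to about 0.31,
which illustrates why modest gains in $R$ above the threshold
can be worthwhile.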

The following algorithm is proposed for choosing the 
structure of the regression equation. 

1. Determine the coefficient of variation of the dependent
variable $y$ in the statistical sample used for constructing
the model.

2. For a large coefficient of variation (> 20 - 30%) construct
linear and nonlinear regression models. Check the adequacy of
the models developed.

3. Compare the accuracy of the models developed in terms of
the magnitude of the residual dispersion using the Fisher
criterion:

$$F = \frac{\max(s_1^2, s_2^2)}{\min(s_1^2, s_2^2)},$$

where $s_1^2$ and $s_2^2$ are the residual dispersions of the
models being compared.


Compare the computed value of the criterion with the
tabulated value for the significance level $\alpha = 0.05$ and
numbers of degrees of freedom $(n_1 - k_1 - 1)$ and
$(n_2 - k_2 - 1)$. If the computed value of the criterion is
greater than the tabulated value, the residual dispersions
differ significantly and the model with the smaller residual
dispersion is chosen. Otherwise the models are taken to be of
equal accuracy and preference is given to the simpler model,
viz., the linear model.

4. If the multiple correlation coefficient of the chosen model
is $R < 0.86$, take measures to increase the accuracy of the
model. Such measures include increasing the size of the initial
sample used to construct the regression model, including
additional factors in the model, and choosing a different
structure for the model.
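
The four steps can be sketched in code as follows. This is an
illustrative reading of the algorithm, not the authors' program.
The function names and the default threshold of 20% (the lower
end of the 20 - 30% range) are assumptions, and the tabulated
Fisher value is passed in by the caller rather than computed.
The comparison uses the standard convention for the F test: the
dispersions are taken to differ when the computed ratio exceeds
the tabulated value. All numbers in the usage lines are
hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Step 1: sample coefficient of variation in percent."""
    return 100.0 * stdev(values) / abs(mean(values))


def fisher_ratio(s2_a, s2_b):
    """Step 3: larger residual dispersion divided by the smaller one."""
    return max(s2_a, s2_b) / min(s2_a, s2_b)


def choose_structure(cv_percent, s2_linear, s2_nonlinear, f_tabulated,
                     cv_threshold=20.0):
    """Return 'linear' or 'nonlinear' following steps 1 to 3.

    cv_threshold is an assumed cut-off; the text gives a 20-30% range.
    f_tabulated is the table value for alpha = 0.05 and the degrees of
    freedom of the two models, supplied by the caller.
    """
    if cv_percent <= cv_threshold:
        # Small variation: a linear surface is taken to be adequate.
        return "linear"
    if fisher_ratio(s2_linear, s2_nonlinear) > f_tabulated:
        # Dispersions differ significantly: keep the more accurate model.
        return "linear" if s2_linear < s2_nonlinear else "nonlinear"
    # Equal accuracy: prefer the simpler (linear) model.
    return "linear"


def needs_improvement(r_multiple, r_threshold=0.86):
    """Step 4: flag models whose multiple correlation is below 0.86."""
    return r_multiple < r_threshold


# Hypothetical inputs, for illustration only.
print(coefficient_of_variation([9.0, 10.0, 11.0]))
print(choose_structure(35.0, 4.0, 1.0, f_tabulated=1.5))
print(choose_structure(35.0, 1.1, 1.0, f_tabulated=1.5))
print(needs_improvement(0.84))
```

Passing the tabulated value as an argument mirrors the use of
statistical tables in the text and keeps the sketch free of
external dependencies.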

We shall now examine a mathematical description of the
technological process of glass ribbon formation in a float
bath. Optical distortions of the glass are largely determined
by the formation regime [4]. The specifications given in
GOST 111-2001 (Sheet Glass: Technical Conditions) do not allow
‘zebra’ band distortions at angles < 50° for M0 grade glass.
For optical distortions visible in reflected light, distortions
of the reflected raster index greater than 3 mm are not allowed
for M0 grade glass.

A sample comprising 365 measurements of the average
daily indices of optical distortions of the sheet glass being
produced and the technological operating regimes of the
float bath was used to construct a regression model. The
following were studied as the influential variables: the
temperature in the passages in the float bath, daily fluctuations
in the glass density and the change in the thickness of the glass
produced. The quality indices for glass formation were taken to
be the optical distortions visible in transmitted light (‘zebra’)
and distortions visible in reflected light (raster).

The coefficients of variation of the optical distortions of 
the glass in the statistical sample were 11.8% in terms of the 
zebra index and 63.5% in terms of the raster. The coefficient 
of variation of the reflected raster index is significant, and for 
this reason an analysis using a nonlinear regression model is 
desirable. 

Linear and nonlinear models were constructed; assessments
of the models are presented in Table 1. As one can see from
the tabulated data, a quadratic polynomial regression model
gives a more accurate description of the dependence of the
deviations of the reflected raster index on the formation
regime and the thickness of the glass ribbon formed. The effect
of all other factors was found to be statistically
insignificant. The nonlinear regression equation describing the
deviations of the reflected raster index Ra has the form

$$\mathrm{Ra} = -455.57 - 4.91\,\delta + 0.938\,\theta_1 + 0.57\,\delta^2 - 0.00047\,\theta_1^2,$$

where $\delta$ is the thickness (in mm) of the glass produced and
$\theta_1$ is the temperature in the first passage of the float
bath, °C.
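
The raster equation can be evaluated directly, as in the short
sketch below. Two assumptions are made: the symbols are read as
glass thickness and first-passage temperature, and the last term
is taken to be quadratic in the temperature, consistent with a
quadratic polynomial model. The function name and the input
values (4 mm, 1000 °C) are hypothetical and are not plant data.

```python
def raster_index(delta_mm, theta1_c):
    """Reflected raster index Ra (mm) from the quadratic model.

    delta_mm: glass thickness, mm.
    theta1_c: temperature in the first passage of the float bath, deg C.
    The theta1_c**2 term is an assumed reading of the source equation.
    """
    return (-455.57
            - 4.91 * delta_mm
            + 0.938 * theta1_c
            + 0.57 * delta_mm ** 2
            - 0.00047 * theta1_c ** 2)


# Hypothetical operating point, for illustration only.
ra = raster_index(4.0, 1000.0)
print(round(ra, 2))
```

For these illustrative inputs the model gives a value of about
1.9 mm, which lies below the 3 mm limit quoted for M0 grade
glass.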

TABLE 1. Estimates of Raster Regression Models

                                              Structure of the model
Index                                         linear        quadratic polynomial
Number of factor variables                    4             2
Multiple coefficient of regression            0.53          0.77
Significance of the multiple coefficient
  of regression                               Significant   Significant
Residual dispersion of the model, mm²         8.7           0.64
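
As a check of step 3 of the algorithm, the Fisher criterion can
be computed from the residual dispersions in Table 1. The
comparison value of 1.35 is the standard tabulated value for
$\alpha = 0.05$ with 120 and 120 degrees of freedom; it is quoted
here only as an upper bound, since tabulated values decrease as
the degrees of freedom grow and the sample contains 365
observations.

```latex
F = \frac{\max(s_1^2, s_2^2)}{\min(s_1^2, s_2^2)}
  = \frac{8.7}{0.64} \approx 13.6 \;>\; 1.35
```

The computed value far exceeds the tabulated one, so the
residual dispersions differ significantly and the quadratic
polynomial model, which has the smaller residual dispersion, is
preferred.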


The coefficient of variation of the optical distortions of
the glass in terms of the zebra index is small (11.8%) in the
statistical sample. This made it possible to describe the
process adequately by a linear regression model of the form

$$\text{zebra} = -34.97 + 5.91\,\delta + 0.118\,\theta_1 - 0.085\,\theta_{20},$$

where $\delta$ is the thickness (in mm) of the glass produced and
$\theta_1$ and $\theta_{20}$ are the temperatures in the first and
20th passages of the float bath, °C.
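
Because the zebra model is linear, each coefficient gives the
change in the index per unit change of its variable with the
others held fixed. The sketch below illustrates this; the
function name and the operating point (4 mm, 1000 °C, 600 °C) are
hypothetical and serve only to show the arithmetic.

```python
def zebra_index(delta_mm, theta1_c, theta20_c):
    """Zebra index from the linear model.

    delta_mm:  glass thickness, mm.
    theta1_c:  temperature in the first passage of the float bath, deg C.
    theta20_c: temperature in the 20th passage of the float bath, deg C.
    """
    return -34.97 + 5.91 * delta_mm + 0.118 * theta1_c - 0.085 * theta20_c


# Hypothetical operating point, for illustration only.
base = zebra_index(4.0, 1000.0, 600.0)

# Effect of raising the first-passage temperature by 10 deg C:
# for a linear model this equals 10 times the coefficient 0.118.
shifted = zebra_index(4.0, 1010.0, 600.0)
print(round(base, 2), round(shifted - base, 2))
```

The same reading applies to the other terms: in this model a
higher temperature in the 20th passage lowers the index, and a
thicker ribbon raises it.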

The multiple correlation coefficient of the model is
$R = 0.84$ and the residual dispersion is 11 (°C)². The use of a
quadratic polynomial structure did not make it possible to
increase the accuracy of the model of optical distortions of the
glass in terms of the ‘zebra’ index.

If necessary, the accuracy of the models developed can
be increased by including in the model additional factors
affecting the process resulting in the formation of a glass
ribbon in the float bath.

Analysis of the variation of the indices of the technological
process resulting in the formation of a glass ribbon in a
float bath made it possible to choose regression models that
adequately describe the dependence of the optical distortions
on the glass thickness and the formation regime.